In some cases the library may be assembled directly from existing or wild-type 
genes. In a preferred embodiment, however, the DNA library is assembled from the fusion 
of two or more sub-libraries. By the in-frame ligation of the sub-libraries, it is possible to 
create a large number of novel genetic constructs encoding useful targeted glycosylation 
activities. For example, one useful sub-library includes DNA sequences encoding any 
combination of enzymes such as sialyltransferases, mannosidases, fucosyltransferases, 
galactosyltransferases, glucosyltransferases, and GlcNAc transferases. Preferably, the 
enzymes are of human origin, although other mammalian, animal, or fungal enzymes are 
also useful. In a preferred embodiment, genes are truncated to give fragments encoding 
the catalytic domains of the enzymes. By removing endogenous targeting sequences, the 
enzymes may then be redirected and expressed in other cellular loci. The choice of such 
catalytic domains may be guided by the knowledge of the particular environment in 
which the catalytic domain is subsequently to be active. For example, if a particular 
glycosylation enzyme is to be active in the late Golgi, and all known enzymes of the host 
organism in the late Golgi have a certain pH optimum, then a catalytic domain is chosen 
which exhibits adequate activity at that pH. 
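
The pH-based choice of a catalytic domain described above can be sketched as a simple filter. This is only an illustration under stated assumptions: the domain names, pH optima, and tolerance values below are hypothetical placeholders, not measured data, and "adequate activity" is modeled crudely as the target pH falling within a fixed distance of the optimum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalyticDomain:
    name: str
    ph_optimum: float    # pH of maximal activity (hypothetical value)
    ph_tolerance: float  # distance from the optimum still judged adequate (hypothetical)


def is_adequate(domain: CatalyticDomain, target_ph: float) -> bool:
    """True if the target pH lies within the domain's tolerated window."""
    return abs(domain.ph_optimum - target_ph) <= domain.ph_tolerance


def select_domains(candidates, target_ph):
    """Keep only the candidate domains judged adequate at the target pH."""
    return [d for d in candidates if is_adequate(d, target_ph)]


# Hypothetical candidates; none of these numbers come from the specification.
candidates = [
    CatalyticDomain("domain_A", ph_optimum=5.5, ph_tolerance=0.7),
    CatalyticDomain("domain_B", ph_optimum=7.0, ph_tolerance=0.5),
    CatalyticDomain("domain_C", ph_optimum=6.2, ph_tolerance=0.4),
]

# Assumed pH of the compartment in which the enzyme is to be active.
chosen = select_domains(candidates, target_ph=6.0)
print([d.name for d in chosen])
```

In practice the adequacy criterion would more likely come from a measured activity-versus-pH profile than from a single window around the optimum.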

Another useful sub-library includes DNA sequences encoding signal peptides that 
result in localization of a protein to a particular location within the ER, Golgi, or trans 
Golgi network. These signal sequences may be selected from the host organism as well 
as from other related or unrelated organisms. Membrane-bound proteins of the ER or 
Golgi typically include, for example, N-terminal sequences encoding a cytosolic tail 
(ct), a transmembrane domain (tmd), and a stem region (sr). The ct, tmd, and sr 
sequences are sufficient individually or in combination to anchor proteins to the inner 
(lumenal) membrane of the organelle. Accordingly, a preferred embodiment of the sub- 
library of signal sequences includes ct, tmd, and/or sr sequences from these proteins. In 
some cases it is desirable to provide the sub-library with varying lengths of sr sequence. 
This may be accomplished by PCR using primers that bind to the 5' end of the DNA 
encoding the cytosolic region and employing a series of opposing primers that bind to 
various parts of the stem region. Still other useful sources of signal sequences include 
retrieval signal peptides, e.g. the tetrapeptides HDEL (SEQ ID NO:5) or KDEL (SEQ ID 
NO:6), which are typically found at the C-terminus of proteins that are transported 
retrograde into the ER or Golgi. Still other sources of signal sequences include (a) type II 
membrane proteins, (b) the enzymes listed in Table 3, (c) membrane-spanning nucleotide 
sugar transporters that are localized in the Golgi, and (d) sequences referenced in Table 5. 
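
The PCR approach for producing stem regions of varying length can be sketched as follows. This is a minimal illustration, not a primer-design tool: it assumes the ct, tmd, and sr lengths are known in codons, places each opposing (reverse) primer so that the amplified fragment ends on a codon boundary within the stem region, and ignores melting temperature, restriction sites, and other practical design constraints. The function names, the 18-nt primer length, and the toy sequence are all hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def stem_truncations(cds, ct_codons, tmd_codons, sr_codons,
                     step_codons, primer_len=18):
    """Return (fragment, reverse_primer) pairs.

    Every fragment starts at the first codon (the 5' end of the DNA encoding
    the cytosolic tail) and ends at successive codon boundaries inside the
    stem region, so each fragment length is a multiple of three.
    """
    if step_codons < 1:
        raise ValueError("step_codons must be at least 1")
    if len(cds) < 3 * (ct_codons + tmd_codons + sr_codons):
        raise ValueError("cds is shorter than the annotated ct + tmd + sr")
    anchor_end = ct_codons + tmd_codons  # last codon of the membrane anchor
    results = []
    for end_codon in range(anchor_end + step_codons,
                           anchor_end + sr_codons + 1, step_codons):
        end_nt = 3 * end_codon
        if end_nt < primer_len:
            raise ValueError("fragment is shorter than the primer length")
        fragment = cds[:end_nt]
        # The opposing primer anneals to the 3' end of the desired fragment.
        reverse_primer = reverse_complement(cds[end_nt - primer_len:end_nt])
        results.append((fragment, reverse_primer))
    return results


# Toy 20-codon sequence (placeholder, not a real gene): 3 ct + 7 tmd + 10 sr codons.
toy_cds = "ATG" + "GCT" * 19
for fragment, primer in stem_truncations(toy_cds, 3, 7, 10, step_codons=5):
    print(len(fragment), primer)
```

Ending every fragment on a codon boundary is what keeps a downstream catalytic domain in frame after ligation.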



Table 5. Sources of useful compartmental targeting sequences 

Gene or Sequence                  Organism       Function                 Location of Gene Product
MnsI                              S. cerevisiae  α-1,2-mannosidase        ER
OCH1                              S. cerevisiae  1,6-mannosyltransferase  Golgi (cis)
MNN2                              S. cerevisiae  1,2-mannosyltransferase  Golgi (medial)
MNN1                              S. cerevisiae  1,3-mannosyltransferase  Golgi (trans)
OCH1                              P. pastoris    1,6-mannosyltransferase  Golgi (cis)
2,6 ST                            H. sapiens     2,6-sialyltransferase    trans Golgi network
UDP-Gal T                         S. pombe       UDP-Gal transporter      Golgi
Mnt1                              S. cerevisiae  1,2-mannosyltransferase  Golgi (cis)
HDEL (SEQ ID NO:5) at C-terminus  S. cerevisiae  retrieval signal         ER
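
To make the combinatorial idea concrete, the entries of Table 5 can be held as records, filtered by compartment, and paired with the enzyme classes named earlier for the catalytic-domain sub-library. This is a sketch only: the function names are hypothetical, matching on the keyword "trans" is an assumed stand-in for "late Golgi", and the pairing says nothing about the orientation of the fusion (the HDEL signal, for instance, sits at the C-terminus).

```python
from itertools import product

# (gene or sequence, organism, function, location), transcribed from Table 5.
TARGETING_SOURCES = [
    ("MnsI", "S. cerevisiae", "alpha-1,2-mannosidase", "ER"),
    ("OCH1", "S. cerevisiae", "1,6-mannosyltransferase", "Golgi (cis)"),
    ("MNN2", "S. cerevisiae", "1,2-mannosyltransferase", "Golgi (medial)"),
    ("MNN1", "S. cerevisiae", "1,3-mannosyltransferase", "Golgi (trans)"),
    ("OCH1", "P. pastoris", "1,6-mannosyltransferase", "Golgi (cis)"),
    ("2,6 ST", "H. sapiens", "2,6-sialyltransferase", "trans Golgi network"),
    ("UDP-Gal T", "S. pombe", "UDP-Gal transporter", "Golgi"),
    ("Mnt1", "S. cerevisiae", "1,2-mannosyltransferase", "Golgi (cis)"),
    ("HDEL (SEQ ID NO:5) at C-terminus", "S. cerevisiae", "retrieval signal", "ER"),
]

# Enzyme classes named in the text for the catalytic-domain sub-library.
ENZYME_CLASSES = [
    "sialyltransferase", "mannosidase", "fucosyltransferase",
    "galactosyltransferase", "glucosyltransferase", "GlcNAc transferase",
]


def sources_in(location_keyword):
    """Table 5 rows whose location contains the keyword (case-insensitive)."""
    key = location_keyword.lower()
    return [row for row in TARGETING_SOURCES if key in row[3].lower()]


def enumerate_constructs(targeting_rows, enzyme_classes):
    """Pair every targeting source with every enzyme class."""
    return [
        (f"{gene} ({organism})", enzyme)
        for (gene, organism, _function, _location), enzyme
        in product(targeting_rows, enzyme_classes)
    ]


# Assumption: "trans" entries approximate late Golgi localization.
late_golgi = sources_in("trans")
for construct in enumerate_constructs(late_golgi, ["sialyltransferase"]):
    print(construct)

# Size of the full two-way combination of these two lists.
print(len(enumerate_constructs(TARGETING_SOURCES, ENZYME_CLASSES)))
```

Restricting the targeting rows before pairing mirrors the preference for signal sequences appropriate to the activity being engineered, while the full product shows how quickly the number of candidate constructs grows.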



In any case, it is highly preferred that signal sequences are selected which are 
appropriate for the enzymatic activity or activities which are to be engineered into the 
host. For example, in developing a modified microorganism capable of terminal 
sialylation of nascent N-glycans, a process which occurs in the late Golgi in humans, it is 
desirable to utilize a sub-library of signal sequences derived from late Golgi proteins. 
Similarly, the trimming of Man8GlcNAc2 by an α-1,2-mannosidase to give Man5GlcNAc2 






